Generalized Bandit Problems

نویسنده

  • Rangarajan K. Sundaram
چکیده

1 The questions addressed in this paper grew out of my work with Jeff Banks on bandit problems and their applications (Banks and Sundaram (1992a, 1992b, 1994)) and owe much to many discussions I had with him on this subject. I also had the benefit of several discussions with Andy McLennan, especially regarding the material in Sections 4 and 6 of this paper.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Exploration vs Exploitation Trade-Off in Bandit Problems: An Empirical Study

We compare well-known action selection policies used in reinforcement learning like ǫ-greedy and softmax with lesser known ones like the Gittins index and the knowledge gradient on bandit problems. The latter two are in comparison very performant. Moreover the knowledge gradient can be generalized to other than bandit problems.

متن کامل

Enhancing Evolutionary Optimization in Uncertain Environments by Allocating Evaluations via Multi-armed Bandit Algorithms

Optimization problems with uncertain fitness functions are common in the real world, and present unique challenges for evolutionary optimization approaches. Existing issues include excessively expensive evaluation, lack of solution reliability, and incapability in maintaining high overall fitness during optimization. Using conversion rate optimization as an example, this paper proposes a series...

متن کامل

Parametric Bandits: The Generalized Linear Case

We consider structured multi-armed bandit problems based on the Generalized Linear Model (GLM) framework of statistics. For these bandits, we propose a new algorithm, called GLM-UCB. We derive finite time, high probability bounds on the regret of the algorithm, extending previous analyses developed for the linear bandits to the non-linear case. The analysis highlights a key difficulty in genera...

متن کامل

Four proofs of Gittins' multiarmed bandit theorem

We study four proofs that the Gittins index priority rule is optimal for alternative bandit processes. These include Gittins’ original exchange argument, Weber’s prevailing charge argument, Whittle’s Lagrangian dual approach, and Bertsimas and Niño-Mora’s proof based on the achievable region approach and generalized conservation laws. We extend the achievable region proof to infinite countable ...

متن کامل

Online Linear Optimization with Sparsity Constraints

We study the problem of online linear optimization with sparsity constraints in the 1 semi-bandit setting. It can be seen as a marriage between two well-known problems: 2 the online linear optimization problem and the combinatorial bandit problem. For 3 this problem, we provide two algorithms which are efficient and achieve sublinear 4 regret bounds. Moreover, we extend our results to two gener...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003